KMID : 1146320150030010035
Journal of Health Technology Assessment
2015 Volume.3 No. 1 p.35 ~ p.47
Development of Open-Source Data Analysis Program for Healthcare Big Data
Park Hyung-Deuk

Lee Sang-Soo
Abstract
Objectives: An era of open and transparent information is under way in Korean healthcare. The Health Insurance Review and Assessment Service has disclosed National Patient Sample claims data since 2009, and the National Health Insurance Service has announced that it will disclose a nine-year cohort database of national health insurance claims to healthcare stakeholders. Because Korea uses fee-for-service as the basic payment system for all medical treatments except seven common diseases, which are reimbursed under the Diagnosis Related Group system, medical treatment practice and resource utilization can readily be identified for individual medical procedures. SAS is the generally accepted data analysis tool because the average size of national health insurance claims data easily exceeds 30 gigabytes. However, data analysis with SAS is labor-intensive and time-consuming, and its costly license fees limit accessibility. As the need to analyze healthcare big data quickly and appropriately grows, so does the demand for new data analysis tools.

Methods: An open-source big data analysis program named BigPy was developed in Python, a high-level object-oriented programming language. BigPy's design philosophy emphasizes code readability and reusability, and its syntax allows users to express concepts in fewer lines of code than would be possible in statistical software such as SAS or R.

Results: BigPy consists of a series of data analysis macros and functions. Its functions can readily read, trim, sort, and merge healthcare big data in database format and convert large datasets to Hierarchical Data Format version 5 (HDF5) files.
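
The read, trim, sort, and merge steps named in the Results can be illustrated with a minimal Python sketch. It is not BigPy's code: the abstract does not give BigPy's function names or the layout of the claims data, so the helper names (read_rows, sort_rows, merge_rows), the column names, and the toy records below are all hypothetical. It uses only the standard library, and the HDF5 conversion step is left out because it would require a third-party package such as h5py or PyTables.

```python
import csv
import io

# Toy records with hypothetical column names; real claims files differ.
CLAIMS_CSV = """patient_id,claim_id,cost
P02,C3, 120
P01,C1, 300
P01,C2, 80
"""

PATIENTS_CSV = """patient_id,sex
P01,F
P02,M
"""


def read_rows(text):
    """Read CSV text into a list of dicts, trimming whitespace from each value."""
    reader = csv.DictReader(io.StringIO(text))
    return [{key: value.strip() for key, value in row.items()} for row in reader]


def sort_rows(rows, key):
    """Return the rows sorted by one column (string comparison)."""
    return sorted(rows, key=lambda row: row[key])


def merge_rows(left, right, key):
    """Left join: attach the matching right-hand row to each left-hand row."""
    lookup = {row[key]: row for row in right}
    return [{**row, **lookup.get(row[key], {})} for row in left]


claims = sort_rows(read_rows(CLAIMS_CSV), "claim_id")
patients = read_rows(PATIENTS_CSV)
merged = merge_rows(claims, patients, "patient_id")

for row in merged:
    print(row)
```

A file of 30 gigabytes would not fit in memory this way; a real tool would have to process the data in chunks or on disk, which is presumably one reason for storing it in HDF5.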

Conclusion: Healthcare stakeholders now have access to a promising new source of knowledge, Big Data. Big Data analysis can address problems related to variability in healthcare quality and consequently improve healthcare treatment. Open-source data analysis tools are a noteworthy and promising way to handle healthcare big data rapidly and cost-effectively.
KEYWORD
Healthcare, Big Data, Open-source, National Health Insurance Claims Data